Global Deaths Due to Air Pollution
Elizabeth Bekele, Alison Cheek
2022-05-03
Introduction
- Air pollution can be detrimental to both our health and the climate
- Outdoor and indoor air pollution cause chronic pain, respiratory
diseases, and shortened lifespans
- Air pollution kills about 7 million people worldwide every year
- We hope this analysis showcases the seriousness of air pollution and
encourages us to be more mindful of our planet
- Overview
- We will see how different air pollution types affect the
population
- Compare past and present population numbers
- Determine which air pollutant type has the highest associated death
rate
Packages Required
#This will allow us to filter through our data
library(tidyverse)
library(dplyr)
#This will help us plot figures to showcase our findings
library(ggplot2)
#This will help us organize and display our data as necessary
library(knitr)
library(kableExtra)
#This expands our plot uses
library(plotly)
#Scientific Notation Disabled
options(scipen=999)
Deaths Data
Import the deaths-due-to-air-pollution data
deaths_df <- read.csv("death-rates-from-air-pollution.csv")
We are going to rename a few of the columns and glimpse the data
colnames(deaths_df) <- c("country", "acronym", "year", "total_deaths", "indoor_deaths", "outdoor_deaths", "ozone_deaths")
glimpse(deaths_df)
## Rows: 6,468
## Columns: 7
## $ country <chr> "Afghanistan", "Afghanistan", "Afghanistan", "Afghanist~
## $ acronym <chr> "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG",~
## $ year <int> 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1~
## $ total_deaths <dbl> 299.4773, 291.2780, 278.9631, 278.7908, 287.1629, 288.0~
## $ indoor_deaths <dbl> 250.3629, 242.5751, 232.0439, 231.6481, 238.8372, 239.9~
## $ outdoor_deaths <dbl> 46.44659, 46.03384, 44.24377, 44.44015, 45.59433, 45.36~
## $ ozone_deaths <dbl> 5.616442, 5.603960, 5.611822, 5.655266, 5.718922, 5.739~
Data Variables
Variables that interest us here include:
- country
- total_deaths: per 100,000
- indoor_deaths: Indoor air pollution is considered
pollution that occurs in the household. Cooking with solid fuels:
- Wood
- Crop waste, dung
- Charcoal, coal
- outdoor_deaths: Outdoor air or ambient air are
emissions caused by combustion processes from motor vehicles, solid fuel
burning and industry
- Ozone (O3)
- Particulate matter (PM10 and PM2.5)
- Nitrogen dioxide (NO2)
- Carbon monoxide (CO)
- Sulfur dioxide (SO2)
- ozone_deaths: Ozone is a gas that occurs both in
Earth’s upper atmosphere and at ground level. Ozone in the atmosphere is
an important and helpful greenhouse gas, but ground-level ozone is
created by extensive use of fossil fuels:
- Pollutants emitted by cars
- Power plants, industrial boilers, refineries, chemical plants
World Population Data
Now, let’s take a look at the population data.
world_pop <- read.csv("population_total_long.csv")
glimpse(world_pop)
## Rows: 12,595
## Columns: 3
## $ Country.Name <chr> "Aruba", "Afghanistan", "Angola", "Albania", "Andorra", "~
## $ Year <int> 1960, 1960, 1960, 1960, 1960, 1960, 1960, 1960, 1960, 196~
## $ Count <int> 54211, 8996973, 5454933, 1608800, 13411, 92418, 20481779,~
To get a general idea of the deaths data frame we made, let’s plot it
to see what’s happening. This is a plot of indoor versus outdoor deaths
around the world, by country.
This is a mess, so we chose two countries from each continent (a
high-population and a low-population country) to graph.
We selected a high-population country from each continent and used the
formula below to find a target for its low-population counterpart.
Low population = high population * 0.10

| Country.Name | Year | Count |
| Australia | 1997 | 18517000 |
| Brazil | 1997 | 167209040 |
| Germany | 1997 | 82034771 |
| Nigeria | 1997 | 113457663 |
| Pakistan | 1997 | 131057431 |
| United States | 1997 | 272657000 |

| Country.Name | Year | Count |
| Canada | 1997 | 29905948 |
| Chile | 1997 | 14786220 |
| Sri Lanka | 1997 | 18470900 |
| Malawi | 1997 | 10264906 |
| New Zealand | 1997 | 3781300 |
| Serbia | 1997 | 7596501 |
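As a rough check of the 10% rule, the sketch below computes the low-population target for each high-population country and compares it with the country that was actually chosen. The populations are the 1997 figures from the tables above; the pairing of countries by continent and the names `high_pop`, `low_pop`, and `comparison` are assumptions made for this illustration.

```r
# 1997 populations of the high-population countries (from the table above)
high_pop <- data.frame(
  country = c("Australia", "Brazil", "Germany", "Nigeria", "Pakistan", "United States"),
  count   = c(18517000, 167209040, 82034771, 113457663, 131057431, 272657000)
)

# 1997 populations of the low-population counterparts.
# The continent pairing (row i of low_pop matches row i of high_pop) is assumed.
low_pop <- data.frame(
  country = c("New Zealand", "Chile", "Serbia", "Malawi", "Sri Lanka", "Canada"),
  count   = c(3781300, 14786220, 7596501, 10264906, 18470900, 29905948)
)

comparison <- data.frame(
  high_country = high_pop$country,
  low_country  = low_pop$country,
  target       = high_pop$count * 0.10,          # Low population = high population * 0.10
  actual       = low_pop$count
)

# Ratio of the chosen low population to the high population (0.10 would be an exact match)
comparison$ratio <- comparison$actual / high_pop$count

print(comparison)
```

The `ratio` column shows how close each chosen country comes to the 10% target; the match is approximate rather than exact.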
Combine Data Sets
First let’s look at a table of the high and low populated countries
using the world population data set.
| Country.Name | Year | Count |
| Australia | 1997 | 18517000 |
| Brazil | 1997 | 167209040 |
| Germany | 1997 | 82034771 |
| Nigeria | 1997 | 113457663 |
| Pakistan | 1997 | 131057431 |
| United States | 1997 | 272657000 |

| Country.Name | Year | Count |
| Canada | 1997 | 29905948 |
| Chile | 1997 | 14786220 |
| Sri Lanka | 1997 | 18470900 |
| Malawi | 1997 | 10264906 |
| New Zealand | 1997 | 3781300 |
| Serbia | 1997 | 7596501 |
Next, we are going to see the death rates for the high- and
low-population countries using the deaths data frame.
| country | acronym | year | total_deaths | indoor_deaths | outdoor_deaths | ozone_deaths |
| Australia | AUS | 1997 | 22.43025 | 0.3222224 | 21.838737 | 0.3141838 |
| Australia | AUS | 1998 | 21.50529 | 0.2839769 | 20.960276 | 0.3048918 |
| Australia | AUS | 1999 | 20.40911 | 0.2590092 | 19.897091 | 0.2953354 |
| Australia | AUS | 2000 | 19.39822 | 0.2398763 | 18.909240 | 0.2899216 |
| Australia | AUS | 2001 | 18.58572 | 0.2234341 | 18.118700 | 0.2836469 |
| Australia | AUS | 2002 | 18.11849 | 0.2105980 | 17.662269 | 0.2859938 |
| Australia | AUS | 2003 | 17.23830 | 0.1937083 | 16.802536 | 0.2816949 |
| Australia | AUS | 2004 | 16.34770 | 0.1760229 | 15.932077 | 0.2785466 |
| Australia | AUS | 2005 | 15.41337 | 0.1599279 | 15.016089 | 0.2757150 |
| Australia | AUS | 2006 | 14.92239 | 0.1496469 | 14.530223 | 0.2819060 |
| Australia | AUS | 2007 | 14.92140 | 0.1449723 | 14.514884 | 0.3042005 |
| Australia | AUS | 2008 | 14.64683 | 0.1383225 | 14.228709 | 0.3254648 |
| Australia | AUS | 2009 | 14.11563 | 0.1259313 | 13.694572 | 0.3431982 |
| Australia | AUS | 2010 | 13.57171 | 0.1174834 | 13.140380 | 0.3647233 |
| Australia | AUS | 2011 | 13.72763 | 0.1119247 | 13.276676 | 0.3956796 |
| Australia | AUS | 2012 | 12.65973 | 0.1018626 | 12.196401 | 0.4192914 |
| Australia | AUS | 2013 | 11.87449 | 0.0973836 | 11.384154 | 0.4530427 |
| Australia | AUS | 2014 | 11.47268 | 0.0931036 | 10.939491 | 0.5037056 |
| Australia | AUS | 2015 | 11.27679 | 0.0886376 | 10.702072 | 0.5544068 |
| Australia | AUS | 2016 | 10.58644 | 0.0844017 | 9.974549 | 0.5955779 |
| Australia | AUS | 2017 | 10.79595 | 0.0833628 | 10.128111 | 0.6592419 |
| Brazil | BRA | 1997 | 57.64589 | 26.4634509 | 28.615177 | 3.3000853 |

| country | acronym | year | total_deaths | indoor_deaths | outdoor_deaths | ozone_deaths |
| Canada | CAN | 1997 | 21.92768 | 0.0877542 | 19.908473 | 2.1959403 |
| Canada | CAN | 1998 | 21.65538 | 0.0824492 | 19.634839 | 2.2056813 |
| Canada | CAN | 1999 | 21.17703 | 0.0751278 | 19.179045 | 2.1894261 |
| Canada | CAN | 2000 | 20.26486 | 0.0681836 | 18.326999 | 2.1277328 |
| Canada | CAN | 2001 | 19.82451 | 0.0641108 | 17.938427 | 2.0764642 |
| Canada | CAN | 2002 | 19.52428 | 0.0604824 | 17.669133 | 2.0476034 |
| Canada | CAN | 2003 | 19.17033 | 0.0564743 | 17.338627 | 2.0268644 |
| Canada | CAN | 2004 | 18.40919 | 0.0513588 | 16.629516 | 1.9730254 |
| Canada | CAN | 2005 | 17.79268 | 0.0481667 | 16.030102 | 1.9547116 |
| Canada | CAN | 2006 | 17.14391 | 0.0447622 | 15.445519 | 1.8887355 |
| Canada | CAN | 2007 | 16.93196 | 0.0435468 | 15.229981 | 1.8952587 |
| Canada | CAN | 2008 | 16.51814 | 0.0407468 | 14.829238 | 1.8832421 |
| Canada | CAN | 2009 | 15.76760 | 0.0380831 | 14.118647 | 1.8389200 |
| Canada | CAN | 2010 | 14.88338 | 0.0340653 | 13.281852 | 1.7864304 |
| Canada | CAN | 2011 | 14.59934 | 0.0319160 | 13.030477 | 1.7569979 |
| Canada | CAN | 2012 | 13.82968 | 0.0307105 | 12.243601 | 1.7647269 |
| Canada | CAN | 2013 | 12.97501 | 0.0288027 | 11.410021 | 1.7339970 |
| Canada | CAN | 2014 | 12.61872 | 0.0276959 | 11.032571 | 1.7469907 |
| Canada | CAN | 2015 | 12.21793 | 0.0270578 | 10.609097 | 1.7638948 |
| Canada | CAN | 2016 | 11.00267 | 0.0251286 | 9.397502 | 1.7408337 |
| Canada | CAN | 2017 | 10.71662 | 0.0247705 | 9.110733 | 1.7397181 |
| Chile | CHL | 1997 | 44.35418 | 12.3262645 | 31.559124 | 0.6260233 |
Lastly, we will join the population and deaths data by country and
year.
| country | acronym | year | total_deaths | indoor_deaths | outdoor_deaths | ozone_deaths | Count |
| Australia | AUS | 1997 | 22.43025 | 0.3222224 | 21.838737 | 0.3141838 | 18517000 |
| Australia | AUS | 1998 | 21.50529 | 0.2839769 | 20.960276 | 0.3048918 | 18711000 |
| Australia | AUS | 1999 | 20.40911 | 0.2590092 | 19.897091 | 0.2953354 | 18926000 |
| Australia | AUS | 2000 | 19.39822 | 0.2398763 | 18.909240 | 0.2899216 | 19153000 |
| Australia | AUS | 2001 | 18.58572 | 0.2234341 | 18.118700 | 0.2836469 | 19413000 |
| Australia | AUS | 2002 | 18.11849 | 0.2105980 | 17.662269 | 0.2859938 | 19651400 |
| Australia | AUS | 2003 | 17.23830 | 0.1937083 | 16.802536 | 0.2816949 | 19895400 |
| Australia | AUS | 2004 | 16.34770 | 0.1760229 | 15.932077 | 0.2785466 | 20127400 |
| Australia | AUS | 2005 | 15.41337 | 0.1599279 | 15.016089 | 0.2757150 | 20394800 |
| Australia | AUS | 2006 | 14.92239 | 0.1496469 | 14.530223 | 0.2819060 | 20697900 |
| Australia | AUS | 2007 | 14.92140 | 0.1449723 | 14.514884 | 0.3042005 | 20827600 |
| Australia | AUS | 2008 | 14.64683 | 0.1383225 | 14.228709 | 0.3254648 | 21249200 |
| Australia | AUS | 2009 | 14.11563 | 0.1259313 | 13.694572 | 0.3431982 | 21691700 |
| Australia | AUS | 2010 | 13.57171 | 0.1174834 | 13.140380 | 0.3647233 | 22031750 |
| Australia | AUS | 2011 | 13.72763 | 0.1119247 | 13.276676 | 0.3956796 | 22340024 |
| Australia | AUS | 2012 | 12.65973 | 0.1018626 | 12.196401 | 0.4192914 | 22733465 |
| Australia | AUS | 2013 | 11.87449 | 0.0973836 | 11.384154 | 0.4530427 | 23128129 |
| Australia | AUS | 2014 | 11.47268 | 0.0931036 | 10.939491 | 0.5037056 | 23475686 |
| Australia | AUS | 2015 | 11.27679 | 0.0886376 | 10.702072 | 0.5544068 | 23815995 |
| Australia | AUS | 2016 | 10.58644 | 0.0844017 | 9.974549 | 0.5955779 | 24190907 |
| Australia | AUS | 2017 | 10.79595 | 0.0833628 | 10.128111 | 0.6592419 | 24601860 |
| Brazil | BRA | 1997 | 57.64589 | 26.4634509 | 28.615177 | 3.3000853 | 167209040 |

| country | acronym | year | total_deaths | indoor_deaths | outdoor_deaths | ozone_deaths | Count |
| Canada | CAN | 1997 | 21.92768 | 0.0877542 | 19.908473 | 2.1959403 | 29905948 |
| Canada | CAN | 1998 | 21.65538 | 0.0824492 | 19.634839 | 2.2056813 | 30155173 |
| Canada | CAN | 1999 | 21.17703 | 0.0751278 | 19.179045 | 2.1894261 | 30401286 |
| Canada | CAN | 2000 | 20.26486 | 0.0681836 | 18.326999 | 2.1277328 | 30685730 |
| Canada | CAN | 2001 | 19.82451 | 0.0641108 | 17.938427 | 2.0764642 | 31020902 |
| Canada | CAN | 2002 | 19.52428 | 0.0604824 | 17.669133 | 2.0476034 | 31360079 |
| Canada | CAN | 2003 | 19.17033 | 0.0564743 | 17.338627 | 2.0268644 | 31644028 |
| Canada | CAN | 2004 | 18.40919 | 0.0513588 | 16.629516 | 1.9730254 | 31940655 |
| Canada | CAN | 2005 | 17.79268 | 0.0481667 | 16.030102 | 1.9547116 | 32243753 |
| Canada | CAN | 2006 | 17.14391 | 0.0447622 | 15.445519 | 1.8887355 | 32571174 |
| Canada | CAN | 2007 | 16.93196 | 0.0435468 | 15.229981 | 1.8952587 | 32889025 |
| Canada | CAN | 2008 | 16.51814 | 0.0407468 | 14.829238 | 1.8832421 | 33247118 |
| Canada | CAN | 2009 | 15.76760 | 0.0380831 | 14.118647 | 1.8389200 | 33628895 |
| Canada | CAN | 2010 | 14.88338 | 0.0340653 | 13.281852 | 1.7864304 | 34004889 |
| Canada | CAN | 2011 | 14.59934 | 0.0319160 | 13.030477 | 1.7569979 | 34339328 |
| Canada | CAN | 2012 | 13.82968 | 0.0307105 | 12.243601 | 1.7647269 | 34714222 |
| Canada | CAN | 2013 | 12.97501 | 0.0288027 | 11.410021 | 1.7339970 | 35082954 |
| Canada | CAN | 2014 | 12.61872 | 0.0276959 | 11.032571 | 1.7469907 | 35437435 |
| Canada | CAN | 2015 | 12.21793 | 0.0270578 | 10.609097 | 1.7638948 | 35702908 |
| Canada | CAN | 2016 | 11.00267 | 0.0251286 | 9.397502 | 1.7408337 | 36109487 |
| Canada | CAN | 2017 | 10.71662 | 0.0247705 | 9.110733 | 1.7397181 | 36540268 |
| Chile | CHL | 1997 | 44.35418 | 12.3262645 | 31.559124 | 0.6260233 | 14786220 |
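The code that produced the joined tables is not shown, so the following is only a minimal sketch of how such a join might look with `dplyr::inner_join()`, matching on both country and year. The small sample frames `deaths_sample` and `pop_sample` are hypothetical stand-ins built from rows printed above; the real analysis would use `deaths_df` and `world_pop`.

```r
library(dplyr)

# Hypothetical sample of the deaths data (values copied from the tables above)
deaths_sample <- data.frame(
  country      = c("Australia", "Australia", "Canada"),
  acronym      = c("AUS", "AUS", "CAN"),
  year         = c(1997L, 1998L, 1997L),
  total_deaths = c(22.43025, 21.50529, 21.92768)
)

# Hypothetical sample of the population data, including one country (Chile)
# that has no matching row in deaths_sample
pop_sample <- data.frame(
  Country.Name = c("Australia", "Australia", "Canada", "Chile"),
  Year         = c(1997L, 1998L, 1997L, 1997L),
  Count        = c(18517000, 18711000, 29905948, 14786220)
)

# Match rows on country AND year; the key columns have different names
# in the two frames, so the mapping is spelled out in `by`
joined <- inner_join(
  deaths_sample, pop_sample,
  by = c("country" = "Country.Name", "year" = "Year")
)

print(joined)
```

An inner join keeps only country-year pairs present in both frames, so the Chile row is dropped here. Joining on country alone would repeat every population year against every deaths year, which is why both keys are used.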


Death Count
Which country has the highest death count?
Let’s make a table of the high- and low-population countries and
their respective average death rates due to air pollution.
| country | hp_average_death |
| Australia | 17.76815 |
| Brazil | 48.42928 |
| Germany | 28.10988 |
| Nigeria | 112.30157 |
| Pakistan | 144.33463 |
| United States | 26.35827 |

| country | lp_average_death |
| Canada | 18.18542 |
| Chile | 36.51321 |
| Malawi | 147.77167 |
| New Zealand | 15.92536 |
| Serbia | 80.66558 |
| Sri Lanka | 69.60383 |
Here’s a graph to clearly visualize the previous table
So we’ve looked at the deaths due to pollution, but what percentage
of the population was affected?
| Country.Name | average_population |
| Australia | 21217772 |
| Brazil | 189132292 |
| Germany | 81914540 |
| Nigeria | 148549958 |
| Pakistan | 168525322 |
| United States | 300447600 |

| Country.Name | average_population |
| Canada | 33029774 |
| Chile | 16555805 |
| Malawi | 13605376 |
| New Zealand | 4214995 |
| Serbia | 7345882 |
| Sri Lanka | 19824652 |
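Because `total_deaths` is a rate per 100,000 people, it can be converted to a percentage of the population by dividing by 1,000, and to a rough yearly death estimate by multiplying by the population. The sketch below does this for the high-population countries using the averages from the tables above. It assumes the average rate and the average population cover comparable periods and that the rate can be applied directly to the whole population, so the results are ballpark figures only; the frame name `high` and its new columns are illustrative.

```r
# Average death rate (per 100,000) and average population, copied from the tables above
high <- data.frame(
  country        = c("Australia", "Brazil", "Germany", "Nigeria", "Pakistan", "United States"),
  avg_rate       = c(17.76815, 48.42928, 28.10988, 112.30157, 144.33463, 26.35827),
  avg_population = c(21217772, 189132292, 81914540, 148549958, 168525322, 300447600)
)

# Rate per 100,000 expressed as a percentage of the population
high$percent_affected <- high$avg_rate / 100000 * 100

# Rough estimate of yearly deaths (assumption: rate applies to the whole population)
high$est_deaths_per_year <- high$avg_rate / 100000 * high$avg_population

print(high)
```

Even the highest rate in this sample, Pakistan's, corresponds to well under 1% of the population per year, but applied to a large population that is still a large number of people.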


Pollution Types
Which type of pollution has the greatest number of deaths?
## # A tibble: 6 x 4
## country avg_indoor avg_outdoor avg_ozone
## <chr> <dbl> <dbl> <dbl>
## 1 Australia 0.249 17.2 0.360
## 2 Brazil 19.4 26.8 2.74
## 3 Germany 0.717 25.5 2.34
## 4 Nigeria 75.9 35.2 2.12
## 5 Pakistan 87.7 50.5 10.4
## 6 United States 0.166 22.8 3.92
## # A tibble: 6 x 4
## country avg_indoor avg_outdoor avg_ozone
## <chr> <dbl> <dbl> <dbl>
## 1 Canada 0.0651 16.4 1.97
## 2 Chile 8.69 27.2 0.850
## 3 Malawi 132. 13.8 3.39
## 4 New Zealand 0.291 15.6 0.0728
## 5 Serbia 35.9 42.7 2.94
## 6 Sri Lanka 44.5 24.8 0.430
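To answer the question per country, one option is to pick the column with the largest average in each row. The sketch below does this for the high-population tibble printed above, using base R's `max.col()`; the frame `high_avgs` is re-typed from that output (rounded as printed), and the `dominant` column is a name introduced here for illustration.

```r
# Averages per pollution type, re-typed from the printed tibble above
high_avgs <- data.frame(
  country     = c("Australia", "Brazil", "Germany", "Nigeria", "Pakistan", "United States"),
  avg_indoor  = c(0.249, 19.4, 0.717, 75.9, 87.7, 0.166),
  avg_outdoor = c(17.2, 26.8, 25.5, 35.2, 50.5, 22.8),
  avg_ozone   = c(0.360, 2.74, 2.34, 2.12, 10.4, 3.92)
)

type_cols <- c("avg_indoor", "avg_outdoor", "avg_ozone")

# max.col() returns, for each row, the index of the column holding the maximum
dominant_idx <- max.col(as.matrix(high_avgs[type_cols]), ties.method = "first")
high_avgs$dominant <- type_cols[dominant_idx]

print(high_avgs[c("country", "dominant")])
```

In this sample the outdoor rate is the largest for Australia, Brazil, Germany, and the United States, while the indoor rate is the largest for Nigeria and Pakistan.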
Pollution Over Time
Let’s look at the previous two decades and compare the death rates.
Has there been a change?
This is the first decade, 1996-2006.
| country | High_Deaths_96 | High_Deaths_01 | High_Deaths_06 |
| Australia | 23.04465 | 18.58572 | 14.92239 |
| Brazil | 60.67757 | 49.46436 | 41.46829 |
| Germany | 34.72325 | 28.38756 | 23.83654 |
| Nigeria | 136.08978 | 123.05129 | 102.26653 |
| Pakistan | 155.42988 | 151.25352 | 146.09296 |
| United States | 29.99271 | 28.93114 | 25.93369 |

| country | Low_Deaths_96 | Low_Deaths_01 | Low_Deaths_06 |
| Australia | 22.18101 | 19.82451 | 14.92239 |
| Brazil | 46.36829 | 37.43188 | 41.46829 |
| Germany | 183.14179 | 165.41702 | 23.83654 |
| Nigeria | 93.44700 | 83.18333 | 102.26653 |
| Pakistan | 85.28997 | 72.16239 | 146.09296 |
| United States | 100.66078 | 95.27073 | 25.93369 |
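The code behind these snapshot tables is not shown. One way to build such a table so that every value stays attached to its own country is to reshape the long data with `tidyr::pivot_wider()` instead of binding separately computed columns together. The following is a sketch under that assumption; `snapshot`, `label`, and `wide` are hypothetical names, and the values are copied from the high-population table above.

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format sample: one row per country and snapshot year
snapshot <- data.frame(
  country      = rep(c("Australia", "Brazil"), each = 3),
  year         = rep(c(1996L, 2001L, 2006L), times = 2),
  total_deaths = c(23.04465, 18.58572, 14.92239,
                   60.67757, 49.46436, 41.46829)
)

wide <- snapshot %>%
  # Build column labels such as "High_Deaths_96" from the two-digit year
  mutate(label = sprintf("High_Deaths_%02d", year %% 100L)) %>%
  # One row per country, one column per snapshot year
  pivot_wider(id_cols = country, names_from = label, values_from = total_deaths)

print(wide)
```

Because the reshape is keyed on `country`, the country column and the yearly columns cannot drift out of alignment with each other.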
This is the second decade 2007-2017
| country | High_Deaths_07 | High_Deaths_12 | High_Deaths_17 |
| Australia | 14.92140 | 12.65973 | 10.79595 |
| Brazil | 40.42460 | 35.39069 | 30.32108 |
| Germany | 23.45850 | 20.91536 | 19.82826 |
| Nigeria | 98.90306 | 84.22324 | 81.22147 |
| Pakistan | 143.81724 | 133.93887 | 123.21548 |
| United States | 25.11756 | 21.98194 | 18.82515 |

| country | Low_Deaths_07 | Low_Deaths_12 | Low_Deaths_17 |
| Canada | 16.93196 | 13.82968 | 10.71662 |
| Chile | 30.53130 | 27.31475 | 24.29921 |
| Malawi | 132.12253 | 116.27470 | 104.93508 |
| Serbia | 76.65752 | 72.77354 | 62.57853 |
| Sri Lanka | 66.05987 | 59.22433 | 38.46264 |
| Tonga | 87.81178 | 79.49336 | 70.72940 |
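To put a number on "has there been a change?", the percent change between the first and last year of a decade can be computed directly. The sketch below does this for the high-population countries over 2007-2017, using the values from the table above; `second_decade` and `pct_change` are names introduced for this example.

```r
# Death rates (per 100,000) in 2007 and 2017, copied from the table above
second_decade <- data.frame(
  country   = c("Australia", "Brazil", "Germany", "Nigeria", "Pakistan", "United States"),
  deaths_07 = c(14.92140, 40.42460, 23.45850, 98.90306, 143.81724, 25.11756),
  deaths_17 = c(10.79595, 30.32108, 19.82826, 81.22147, 123.21548, 18.82515)
)

# Percent change from 2007 to 2017; negative values mean the rate fell
second_decade$pct_change <-
  (second_decade$deaths_17 - second_decade$deaths_07) / second_decade$deaths_07 * 100

print(second_decade)
```

Every country in this sample has a negative percent change, meaning the death rate fell over the decade.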
Let’s graph the previous tables!
The first decade.
This shows the second decade.
Which year had the worst indoor? Outdoor particulate? Outdoor
ozone?
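For a single country, the worst year for each pollution type can be found with `which.max()`. The sketch below uses three Australian rows from the deaths table shown earlier; it is an illustration only, since answering the question worldwide would require the full `deaths_df`. The names `aus`, `type_cols`, and `worst_year` are assumptions of this example.

```r
# Three Australian rows copied from the deaths table shown earlier
aus <- data.frame(
  year           = c(1997L, 2007L, 2017L),
  indoor_deaths  = c(0.3222224, 0.1449723, 0.0833628),
  outdoor_deaths = c(21.838737, 14.514884, 10.128111),
  ozone_deaths   = c(0.3141838, 0.3042005, 0.6592419)
)

type_cols <- c("indoor_deaths", "outdoor_deaths", "ozone_deaths")

# For each pollution type, look up the year in which its rate peaks
worst_year <- sapply(aus[type_cols], function(rates) aus$year[which.max(rates)])

print(worst_year)
```

In this small sample the indoor and outdoor rates peak in 1997, while the ozone rate peaks in 2017.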
Indoor Deaths
Which is worse: outdoor or indoor pollution?
Let’s reintroduce a graph we looked at earlier. Instead this time we
will combine the pollutant types together.
We cannot conclude which is worse.
- Low Populated Countries:
- High Populated Countries:
- Outdoor pollution seems to be more detrimental with the exception of
two countries in this sample set.
We have this included already
#Mean total deaths (per 100,000) across 1990-2017 for high-population countries
deaths_highpop_countries <- deaths_df %>%
  filter(country %in% c('United States', 'Brazil', 'Nigeria', 'Germany', 'Pakistan', 'Australia')) %>%
  group_by(country) %>%
  summarize(average_death_high = mean(total_deaths))
#Mean total deaths (per 100,000) across 1990-2017 for low-population countries
deaths_lowpop_countries <- deaths_df %>%
  filter(country %in% c('Canada', 'Chile', 'Malawi', 'Serbia', 'Sri Lanka', 'New Zealand')) %>%
  group_by(country) %>%
  summarize(average_death_low = mean(total_deaths))
#Average death rate of both high- and low-population countries
kable(list(deaths_highpop_countries, deaths_lowpop_countries))
| country | average_death_high |
| Australia | 17.76815 |
| Brazil | 48.42928 |
| Germany | 28.10988 |
| Nigeria | 112.30157 |
| Pakistan | 144.33463 |
| United States | 26.35827 |

| country | average_death_low |
| Canada | 18.18542 |
| Chile | 36.51321 |
| Malawi | 147.77167 |
| New Zealand | 15.92536 |
| Serbia | 80.66558 |
| Sri Lanka | 69.60383 |
ggplot(deaths_highpop_countries) +
  geom_col(mapping = aes(x = country, y = average_death_high)) +
  xlab("Country") +
  ylab("Average deaths (per 100,000)") +
  ggtitle("Average total deaths in high-population countries") +
  coord_flip()

ggplot(deaths_lowpop_countries) +
  geom_col(mapping = aes(x = country, y = average_death_low)) +
  xlab("Country") +
  ylab("Average deaths (per 100,000)") +
  ggtitle("Average total deaths in low-population countries") +
  coord_flip()

Summary
Sources